The examples in this notebook introduce features of BigQuery Standard SQL and BigQuery SQL Data Manipulation Language (beta). BigQuery Standard SQL is compliant with the SQL 2011 standard. You've already seen the use of the magic command %%bq
in the Hello BigQuery and BigQuery Commands notebooks. This command and others in the Google Cloud Datalab API support BigQuery Standard SQL.
In [54]:
%%bq query --name UniqueNames2013
WITH UniqueNames2013 AS
(SELECT DISTINCT name
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE Year = 2013)
SELECT * FROM UniqueNames2013
Now let's list all available commands to work with %%bq
In [55]:
%%bq -h
The dryrun
argument in %%bq
can be helpful to confirm the syntax of the SQL query. Instead of executing the query, it will only return some statistics:
In [56]:
%%bq dryrun -q UniqueNames2013
Out[56]:
Now, let's get a small sample of the results using the sample
argument in %%bq
:
In [57]:
%%bq sample -q UniqueNames2013
Out[57]:
Finally, We can use the execute
command in %%bq to display the results of our query:
In [58]:
%%bq execute -q UniqueNames2013
Out[58]:
In [1]:
import google.datalab.bigquery as bq
In [2]:
# Create a new dataset (this will be deleted later in the notebook)
sample_dataset = bq.Dataset('sampleDML')
if not sample_dataset.exists():
sample_dataset.create(friendly_name = 'Sample Dataset for testing DML', description = 'Created from Sample Notebook in Google Cloud Datalab')
sample_dataset.exists()
In [67]:
# To create a table, we need to create a schema for it.
# Its easiest to create a schema from some existing data, so this
# example demonstrates using an example object
fruit_row = {
'name': 'string value',
'count': 0
}
sample_table1 = bq.Table("sampleDML.fruit_basket").create(schema = bq.Schema.from_data([fruit_row]),
overwrite = True)
In [68]:
%%bq query
INSERT sampleDML.fruit_basket (name, count)
VALUES('banana', 5),
('orange', 10),
('apple', 15),
('mango', 20)
Out[68]:
You may rewrite the previous query as:
In [69]:
%%bq query
INSERT sampleDML.fruit_basket (name, count)
SELECT *
FROM UNNEST([('peach', 25), ('watermelon', 30)])
Out[69]:
You can also use a WITH
clause with INSERT
and SELECT
.
In [70]:
%%bq query
INSERT sampleDML.fruit_basket(name, count)
WITH w AS (
SELECT ARRAY<STRUCT<name string, count int64>>
[('cherry', 35),
('cranberry', 40),
('pear', 45)] col
)
SELECT name, count FROM w, UNNEST(w.col)
Out[70]:
Here is an example that copies one table's contents into another. First we will create a new table.
In [71]:
fruit_row_detailed = {
'name': 'string value',
'count': 0,
'readytoeat': False
}
sample_table2 = bq.Table("sampleDML.fruit_basket_detailed").create(schema = bq.Schema.from_data([fruit_row_detailed]),
overwrite = True)
In [72]:
%%bq query
INSERT sampleDML.fruit_basket_detailed (name, count, readytoeat)
SELECT name, count, false
FROM sampleDML.fruit_basket
Out[72]:
In [73]:
%%bq query
UPDATE sampleDML.fruit_basket_detailed
SET readytoeat = True
WHERE name = 'banana'
Out[73]:
To view the contents of a table in BigQuery, use %%bq tables view
command:
In [ ]:
%%bq tables view --name sampleDML.fruit_basket_detailed
In [75]:
%%bq query
DELETE sampleDML.fruit_basket
WHERE name in ('cherry', 'cranberry')
Out[75]:
Use the following query to delete the corresponding entries in sampleDML.fruit_basket_detailed
In [76]:
%%bq query
DELETE sampleDML.fruit_basket_detailed
WHERE NOT EXISTS
(SELECT * FROM sampleDML.fruit_basket
WHERE fruit_basket_detailed.name = fruit_basket.name)
Out[76]:
In [3]:
# Clear out sample resources
sample_dataset.delete(delete_contents = True)